Matrix Holds Key To On-chip Interconnect

By Bruce Mathewson and Jonathan Morris
Integrated System Design
Posted 10/03/01, 11:27:08 AM EDT

The advent of 0.13-micron and even 0.1-micron semiconductor manufacturing processes has given the system-on-a-chip designer many more gates with which to work. But in a lot of cases, both the system design and performance are limited by the complexity of the interconnection between the different modules and blocks that are integrated into those chips.

The Amba bus has been used widely in the development of system-on-chip designs (often, but not necessarily, based around ARM processor cores) and is now moving to a more sophisticated level of hierarchy in interconnect that is set to fundamentally change the way SoC interconnect is designed.

The current version of the Amba bus arrived in 1999 with the 2.0 specification and introduced the Advanced High Performance Bus (AHB), now the mainstay of the Amba implementation. A key point of the AHB definition is that it defines both the interface and the interconnect, with both intended to allow the maximum bandwidth from any given process technology. While one of the interconnect elements of the AHB is a traditional shared bus with master and slave blocks, developments in the last year have split the interface from the interconnect in a move that has significant implications for the interconnection of blocks on a chip. Amba is now no longer about a bus, but a hierarchy of interconnects with the interface block as the keystone.

AHB-Lite, launched at DATE 2001, defines a pure subset of the AHB interface without the multiple-bus-master capability. That is a key point: the specification only takes some elements, mainly the arbitration protocols, out of the original AHB specification. Those arbitration protocols have essentially moved into the interconnect, along with mundane interconnect elements such as decoders and multiplexers, to simplify the design of the system interconnect.

The specification has always been open, with license terms that enable designers to develop their own Amba interfaces free of charge, and the move to AHB-Lite will make that process more reliable and testable. AHB-Lite is an interface definition and allows the designer of IP components to use a simple interface specification without the need to concern themselves with the details of the interconnect that is used to join components together (Fig. 1).

Basic systems can be built using only AHB-Lite. Such a system contains only one bus master, and therefore one source of address, control and write data, so multiple master-to-slave multiplexers aren't needed (Fig. 2). As a result, there is no HBUSREQx output or HGRANTx input; where those exist (as in the original AHB interface), the former is left unconnected and the latter is tied HIGH. There is also no support for the Split or Retry responses, and only the OKAY and ERROR responses are required. That means the master also doesn't need the HRESP[1] input.

The only remaining arbitration-class signal is the HLOCK output from masters. If a master has an HLOCK output, then it is retimed to generate the HMASTLOCK signal instead. This lock is still required, since the master may be performing a transfer to a multiport slave, which needs to be aware of locked access.
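The handling of the arbitration-class signals described above can be summarized in a minimal sketch. The function name and the wording of the returned strings are hypothetical; only the three rules for HBUSREQx, HGRANTx and HLOCK come from the text, and every other signal is simply reported as connected normally.

```python
def arbitration_tieoffs(master_signals):
    """Map each signal of a full AHB master to how it is handled
    when that master is used in a single-master AHB-Lite system.

    master_signals: iterable of signal names, e.g. "HBUSREQ0".
    The returned descriptions are illustrative text, not tool output.
    """
    handling = {}
    for name in master_signals:
        if name.startswith("HBUSREQ"):
            # No arbiter to request the bus from.
            handling[name] = "left unconnected"
        elif name.startswith("HGRANT"):
            # The single master always owns the bus.
            handling[name] = "tied HIGH"
        elif name == "HLOCK":
            # Still needed so multiport slaves see locked accesses.
            handling[name] = "retimed to generate HMASTLOCK"
        else:
            handling[name] = "connected normally"
    return handling


if __name__ == "__main__":
    for signal, action in arbitration_tieoffs(
        ["HADDR", "HBUSREQ0", "HGRANT0", "HLOCK"]
    ).items():
        print(f"{signal}: {action}")
```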

By far the majority of AHB slaves will be fully compatible with AHB-Lite. The only special circumstance is when slaves make use of the Split or Retry response options. In that case, a standard predesigned wrapper can be added around the slave to suppress those response options.

Compatibility between AHB and AHB-Lite is critical for companies such as LSI Logic, which has standardized its CoreWare methodology around the Amba specification and for whom the prospect of redesigning the existing library is not an option. The interchangeability is shown in Table 1.

Linking strategies
But the key difference is that now the blocks, whether they are master or slave, can be linked with a variety of interconnect strategies to get the maximum bandwidth at the appropriate place in the chip design. That frees the RTL designer to concentrate on the system topology and performance and on the content of the blocks, rather than on the mundane and repetitive development of the bus interface blocks.

This approach also enables the use of a set of verification tools and testbenches to ease the verification process for the blocks, an increasingly important part of the design process.

The first level of the interconnect hierarchy for AHB is the simple point-to-point link. This takes the 32-bit address bus and a variable data bus width from 8 bits to 1,024 bits, running from the output of a master AHB-Lite block to the input of a slave AHB-Lite interface block.

It turns out that a significant portion of an embedded-system design can make use of this simpler approach, for example by linking the microprocessor or digital signal processor and the memory subsystem without having to use a full-blown bus. Using the AHB-Lite interface for such connections also makes it possible to use standard verification and test tools, speeding up that part of the design process.

While most blocks are simply a master, such as a microprocessor core, and others are simply a slave, such as a memory block, other blocks may combine the two requirements.

A direct memory access (DMA) block is a slave while it is being programmed but has to be a master to move data through the system.

Such a block would have two AHB-Lite interfaces: a master interface for connecting to a system bus or other interconnect structure, and a slave interface connected either directly or over the bus to the controlling processor. Using AHB-Lite in the traditional bus requires that an arbiter block be linked to the controlling processor to control the access to the bus by the various masters. While the specification of the arbiter is part of the Amba bus specification, the algorithm that it uses is up to the RTL designer, with the two most popular being "fixed priority" and "round robin".
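The two arbitration schemes can be illustrated with a short behavioral sketch. It is not taken from the Amba specification or the ADK: the function names are hypothetical, and the choice that the lowest-numbered master has the highest fixed priority is an assumption made for the example.

```python
def fixed_priority(requests):
    """Grant the requesting master with the lowest index.

    requests: list of booleans, one per master.
    Assumption: index 0 is the highest priority.
    Returns the granted master index, or None if nobody requests.
    """
    for index, requesting in enumerate(requests):
        if requesting:
            return index
    return None


def round_robin(requests, last_granted):
    """Grant the next requesting master after the one granted last.

    The search starts just past last_granted and wraps around, so
    the previous winner is considered last.
    """
    count = len(requests)
    for offset in range(1, count + 1):
        index = (last_granted + offset) % count
        if requests[index]:
            return index
    return None


if __name__ == "__main__":
    requests = [True, True, False]
    print(fixed_priority(requests))   # master 0 always wins
    print(round_robin(requests, 0))   # master 1 gets its turn
```

With fixed priority, a busy high-priority master can starve the others; round robin trades that risk for a less predictable latency for any individual master.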

To assist in the design process, example templates for the common Amba bus components can be found in ARM's recently released Amba Design Kit (ADK).

Other issues that come up include handling error responses just as one master is handing the bus over to another, and making sure that the error message is applied to the first master block, not the second. By using a template, such as the Example Bus Master Design supplied in the ADK, the developer can avoid the need to deal with such detailed corner cases.
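One way to picture the handover corner case is the following minimal sketch. It assumes a simplified model of a pipelined bus in which the address phase of one transfer overlaps the data phase of the previous one, a transfer completes in a cycle where HREADY is high, and the response in that cycle belongs to whichever master owned the previous address phase. The function name and the per-cycle tuples are hypothetical, and idle transfers are ignored.

```python
def attribute_responses(cycles):
    """Attach each completed transfer's response to the right master.

    cycles: list of (addr_master, hready, hresp) tuples, one per clock.
      addr_master - master driving the address phase (None if no one)
      hready      - True when the current data phase completes
      hresp       - response seen in this cycle, e.g. "OKAY" or "ERROR"
    Returns a list of (master, response) pairs in completion order.
    """
    data_owner = None  # master whose transfer is in its data phase
    completed = []
    for addr_master, hready, hresp in cycles:
        if not hready:
            continue  # wait state: both phases are held
        if data_owner is not None:
            # The response belongs to the data-phase owner, not to
            # the master currently driving the address bus.
            completed.append((data_owner, hresp))
        # The sampled address phase becomes the next data phase.
        data_owner = addr_master
    return completed


if __name__ == "__main__":
    trace = [
        ("M0", True, "OKAY"),    # M0 address phase
        ("M1", False, "ERROR"),  # M0 data phase stalls; M1 now on address bus
        ("M1", True, "ERROR"),   # M0 transfer completes with ERROR
        (None, True, "OKAY"),    # M1 transfer completes with OKAY
    ]
    print(attribute_responses(trace))
```

In the trace above the error is reported while the second master already drives the address bus, yet it is attributed to the first master, which is the behavior the text calls for.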

Ensuring that the design of a component meets all the corner case requirements is only part of the process. Test cases also need to be developed to check all of those scenarios.

ARM's Amba Compliance Testbench (ACT) includes test cases arising from the combination of signals such as bus granted-retry-error that are not necessarily exercised in a custom design but are vital if the bus is to make use of third-party blocks that use the AHB interface, or blocks from other parts of the organization. It also includes coverage cases such as error responses on the first and last transfer of a burst, again to simplify the testing and ensure that the implementation of the interface is consistent with the Amba specification. The Amba compliance testbench includes a protocol checker for Amba signals but can also integrate non-Amba signals, with an HDL-independent testbench.

Up to 16 masters and any number of slave units can sit on the bus, but for more than 16, another layer of hierarchy is necessary. While blocks such as an AHB-AHB bridge have been developed to handle such large systems, a new hierarchical approach has been developed that is expected to become increasingly important as the complexity of a system-on-a-chip grows.

Simplified process
It is this third layer of the interconnect hierarchy that provides the most opportunity for the RTL designer. It uses multiple logical layers and predesigned interconnect structures to simplify the design process. The multilayer version of AHB provides a connection from any AHB master to any AHB slave through a network of multiplexer blocks, effectively avoiding the need for a central arbiter (Fig. 3). That also avoids issues with latency and bus contention, with the only contention situation being if two masters want to link with the same slave at exactly the same time. This is a far cry from contention on a traditional bus structure, which can cost tens of cycles.

That means the interconnect can be considered as just another AHB component, combining the arbitration, decoders and multiplexers in a single unit and dramatically simplifying the design and layout of complex systems.
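The contention behavior of such a matrix can be sketched behaviorally as follows. This is an illustration only, not the ADK implementation: the function name, the slave names in the usage example and the tie-break rule (lowest-numbered master wins) are all assumptions.

```python
from collections import defaultdict


def route_cycle(targets):
    """Resolve one cycle of master-to-slave requests in a matrix.

    targets: dict mapping master id -> slave name (None if idle).
    Masters aiming at different slaves proceed in parallel; only
    masters aiming at the same slave contend with each other.
    Returns (granted, stalled): granted maps slave -> winning master,
    stalled is a sorted list of masters that must wait.
    """
    by_slave = defaultdict(list)
    for master, slave in targets.items():
        if slave is not None:
            by_slave[slave].append(master)

    granted = {}
    stalled = []
    for slave, masters in by_slave.items():
        winner = min(masters)  # assumption: lowest id has priority
        granted[slave] = winner
        stalled.extend(m for m in masters if m != winner)
    return granted, sorted(stalled)


if __name__ == "__main__":
    # Masters 0 and 2 collide on "sram"; master 1 is unaffected.
    print(route_cycle({0: "sram", 1: "dma_regs", 2: "sram"}))
```

On a single shared bus all three masters in the usage example would have to take turns; here only master 2 waits.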

The interconnect matrix provided in the Amba design kit as RTL handles up to eight masters and eight single-port slaves with 32-bit address and data buses. The master sends out the address and the interconnect matrix delivers the data back from the slave.

But the system can be partitioned to mix slaves that require local and global access from the master blocks, as in Fig. 4.

A multifunction peripheral such as a DMA block could be attached to a master interface on one side and to one of the slave ports on the other side, or could use a direct connection from another master to the slave port, combining the point-to-point and multilayer approaches.

Similarly, a dual-port slave (Fig. 5) could use two of the slave ports off the interconnect block if access to all the master units is needed, or use direct connections to the relevant master units instead.

It then becomes a gate trade-off, with two additional ports on the interconnect matrix adding a few thousand gates. It also provides significantly higher bandwidth at specific points in the design, allowing systems to be designed with lower clock frequencies and, therefore, higher yield and reliability.

While the current parallel data bus width is 32 bits, there is no reason other than the available gate budget why it could not be increased to 64 bits or 128 bits to further increase the bandwidth, increasing the size of the multiplexers to compensate.

That is an important point. Many on-chip buses evolved from printed-circuit boards, where the aim was to reduce the pin count of the components. In an ASIC design that is no longer relevant, and the bus bandwidth is only really determined by the gate budget.
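The trade-off between bus width and clock frequency can be put in rough numbers. The sketch below assumes an idealized link that moves one full-width data item every clock cycle with no wait states; the function name and the example frequencies are illustrative and are not figures from any Amba implementation.

```python
def peak_bandwidth_bytes_per_s(width_bits, clock_hz):
    """Idealized peak bandwidth of a parallel data bus.

    Assumes one transfer of width_bits every clock cycle and no
    wait states, so real systems will achieve less.
    """
    return width_bits * clock_hz // 8


if __name__ == "__main__":
    # Illustrative frequencies only: doubling the width lets the
    # clock be halved for the same peak bandwidth.
    narrow = peak_bandwidth_bytes_per_s(32, 100_000_000)
    wide = peak_bandwidth_bytes_per_s(64, 50_000_000)
    print(narrow, wide)
```

Under these assumptions a 64-bit bus at half the clock rate matches a 32-bit bus at the full rate, at the cost of wider multiplexers.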

The multilayer AHB implementation and the AHB-Lite interface emerged from the same internal exercise, since once the arbitration and bus logic is submerged in the interconnect, the interface for the functional block is dramatically simplified, suitable for either a point-to-point link or connection to an interconnect matrix.

The future
The future will get even more interesting, with the output of one interconnect matrix becoming the input of the next, acting in effect like a bridge or a series of bridges to link different parts of the system depending on the performance requirements and the gate budget (Fig. 6).

So the bus structures that were the "glue" of early system-on-chip designs are evolving into more sophisticated, standard components with simple interfaces that can be simply verified. That allows the RTL designer to concentrate on the overall topology of the system, to get the best performance, and on the detail of the key blocks within that system.

---

ARM's Amba specification for on-chip interconnect is taking off in some surprising directions, as described by Bruce Mathewson, Amba technical lead, and Jonathan Morris, Amba marketing manager, both at ARM (Cambridge, U.K.).

http://www.isdmag.com